Abstract

Let $G$ be an $n$-vertex quasirandom graph with $p\binom{n}{2}$ edges, and let $W$ be a random walk on $G$ of length $\beta n^2$. Let $G'$ be the graph obtained from $G$ by deleting the edges traversed by $W$. We show that (for fixed $p$ and $\beta$) with high probability $G'$ is quasirandom with $(e^{-2\beta/p} + o(1))p\binom{n}{2}$ edges. We also obtain a similar result when the random walk is replaced by a random homomorphism of a fixed tree with maximum degree $c\sqrt{\log n}$ for a small constant $c$. This answers a question of Böttcher, Hladký, Piguet and Taraz that arose in the context of tree packing.



1 Introduction 

Given a graph $G$ and sets $A, B \subseteq V(G)$ let $E_G(A,B) = \{(a,b) \in A \times B : ab \in E(G)\}$ and let $e_G(A,B) = |E_G(A,B)|$. A graph $G$ with $n$ vertices and $p\binom{n}{2}$ edges is $\epsilon$-quasirandom if

$$\bigl|e_G(A,B) - p|A||B|\bigr| < \epsilon|A||B|$$

for all sets $A, B \subseteq V(G)$ with $|A|, |B| \geq \epsilon n$. Thus a quasirandom graph resembles a random graph with the same density, provided we do not look too closely. Quasirandom graphs were introduced by Thomason [10] and have come to play a central role in probabilistic and extremal graph theory. The reader is referred to the excellent survey article by Krivelevich and Sudakov [5] for further details.
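The definition can be made concrete on a small example. The following is a minimal sketch, assuming the graph is stored as a dictionary mapping each vertex to the set of its neighbours; the function names `edge_count`, `deviation` and `pair_is_ok` are illustrative and do not come from the paper.

```python
def edge_count(adj, A, B):
    """e_G(A, B): the number of ordered pairs (a, b) in A x B with ab an edge."""
    return sum(1 for a in A for b in B if b in adj[a])

def deviation(adj, A, B):
    """|e_G(A, B) - p|A||B||, where p is the edge density of the graph."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) / 2   # e(G)
    p = m / (n * (n - 1) / 2)                         # G has p * C(n, 2) edges
    return abs(edge_count(adj, A, B) - p * len(A) * len(B))

def pair_is_ok(adj, A, B, eps):
    """Check the quasirandomness inequality for one pair (A, B)."""
    return deviation(adj, A, B) < eps * len(A) * len(B)
```

Checking $\epsilon$-quasirandomness in full would require testing every pair of sufficiently large sets, which is only feasible for very small graphs; the sketch tests one pair at a time.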

Suppose that $G'$ is a random subgraph of $G$ in which each edge is included independently with a fixed probability $p$. It is easy to see that, for any $\eta > \epsilon$, the graph $G'$ is with high probability $\eta$-quasirandom, provided $n$ is large enough. Our main result is that a similar conclusion holds when the subgraph $G'$ of $G$ is chosen according to another natural distribution.

A walk $W$ on $G$ of length $l$ consists of a sequence of vertices $W = W_0 \ldots W_l$ where $W_iW_{i+1}$ is an edge of $G$ for all $i < l$. A random walk $W$ of length $l$ on $G$ is obtained by choosing a start vertex $W_0$ of $G$ from some initial distribution (typically this is either specified precisely or chosen uniformly at random) and then, for each $i < l$, choosing $W_{i+1}$ uniformly at random from the neighbours of $W_i$, with each choice made independently.
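As an illustration of this definition, here is a minimal sketch of a random walk and of the graph $G'$ obtained by deleting the traversed edges. It assumes an adjacency-list dictionary and a uniform start vertex when none is given; the names `random_walk` and `delete_walk_edges` are hypothetical.

```python
import random

def random_walk(adj, length, start=None, rng=None):
    """Return the vertex sequence W_0, ..., W_length of a random walk on G."""
    rng = rng or random.Random()
    # Start uniformly at random unless a start vertex is specified.
    current = start if start is not None else rng.choice(sorted(adj))
    walk = [current]
    for _ in range(length):
        current = rng.choice(adj[current])   # uniform neighbour, chosen independently
        walk.append(current)
    return walk

def delete_walk_edges(adj, walk):
    """Return the edge set of G': the edges of G not traversed by the walk."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    traversed = {frozenset(step) for step in zip(walk, walk[1:])}
    return edges - traversed
```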
Böttcher, Hladký, Piguet and Taraz [3] recently asked the following question. Let $W$ be a random walk of length $\beta n^2$ on $G$ and let $G'$ be the graph obtained from $G$ by deleting the edges of $W$. Is $G'$ quasirandom with high probability? This question arose in the context of a random tree embedding procedure which sought to find disjoint copies of a given collection of trees in the complete graph $K_n$. Böttcher, Hladký, Piguet and Taraz noticed that if removing randomly embedded copies of a small number of such trees from a quasirandom graph $G$ preserves quasirandomness, it may be possible to nibble (or iteratively remove) many disjoint trees from $K_n$.

Our main result is that G' is quasirandom with high probability. We prove two versions of this 
statement. The first applies when the minimum degree of G is reasonably large. 

Theorem 1 (Bounded minimum degree). Let $\beta, \epsilon, p, \eta > 0$ with $\eta > \epsilon$ and let $\gamma = C\epsilon^{1/4}$ for some absolute constant $C > 0$. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges and minimum degree at least $\gamma n$, and let $W$ be a random walk on $G$ of length $\beta n^2$. Then, with probability $1 - o(1)$, the graph $G'$ is $\eta$-quasirandom with $(e^{-2\beta/p} + o(1))p\binom{n}{2}$ edges.

In fact, our proof gives more than this. It shows that with high probability $e_{G'}(A,B) = (e^{-2\beta/p} + o(1))e_G(A,B)$ for all sets $A, B \subseteq V(G)$ with $e_G(A,B) \gg n$. Thus the subgraph spanned by $W$ looks like a random subgraph of $G$ in a very strong sense.

The bound on the minimum degree of $G$ means that $G$ is well-connected and allows us to take advantage of a general result on the rate of convergence of a random walk. For general quasirandom graphs such results do not hold, and at the start of Section 3 we give an example of a poorly-connected graph for which the number of edges in $G'$ can take very different values with positive probability, depending on $\epsilon$ but not on $n$. To obtain a version of Theorem 1 for such graphs we must therefore allow our probability to depend on $\epsilon$ as well as $n$. We write $o_\epsilon(1)$ for a quantity that is less than $f(\epsilon)$ for $n$ sufficiently large, where $f(\epsilon) \to 0$ as $\epsilon \to 0$.

Theorem 2 (General case). Given $\beta, p, \eta > 0$ there exists $\epsilon > 0$ such that the following holds. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges and let $W$ be a random walk on $G$ of length $\beta n^2$ starting at any vertex $W_0$ of $G$ with degree in $[(p-\epsilon)n, (p+\epsilon)n]$. Then, with probability $1 - o_\epsilon(1)$, the graph $G'$ is $\eta$-quasirandom with $(e^{-2\beta/p} + o_\epsilon(1))p\binom{n}{2}$ edges.

It is easily seen that there are at least $(1 - 2\epsilon)n$ choices for such a vertex $W_0$ in $G$ (see Proposition 5), so a vertex of $G$ selected uniformly at random satisfies the conditions of Theorem 2 with high probability.

The proof of Theorem 1 is given in Section 2. We then extend the proof to the general case of Theorem 2 in Section 3 with some additional ideas. In Section 4 we discuss the extension of our methods to random homomorphisms of general trees.

Since we will only prove asymptotic results we make a number of simplifying assumptions. We assume $\epsilon$ is sufficiently small compared to the other parameters, and are only interested in statements for $n$ sufficiently large. We omit notation indicating the taking of integer parts, and ignore questions of divisibility when breaking walks into pieces of a given size.
2 Bounded minimum degree

In our proof of Theorem 1 we will take the following alternative perspective on the construction of a random walk $W$. For each vertex $v$, let $L_v$ be an infinite list of neighbours of $v$ with each entry selected independently and uniformly at random. As before, choose $W_0$ from some given distribution. Then, at every stage $i$, if the walk has just made its $j$th visit to vertex $v$, let $W_{i+1}$ be the $j$th element of $L_v$. It is easy to see that this gives the same distribution on random walks $W$ as described in Section 1.
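The list model can be sketched as follows. This is an illustrative implementation only: it assumes an adjacency-list dictionary, reveals each entry of $L_v$ lazily at the moment it is needed (which gives the same distribution as fixing the infinite lists in advance), and uses the hypothetical name `list_model_walk`.

```python
import random
from collections import defaultdict

def list_model_walk(adj, length, start, rng=None):
    """Build a walk from per-vertex lists L_v of independent uniform neighbours.

    Returns the walk together with the revealed prefix of each list; the
    length of lists[v] is the number of entries of L_v used by the walk.
    """
    rng = rng or random.Random()
    lists = defaultdict(list)
    walk = [start]
    current = start
    for _ in range(length):
        entry = rng.choice(adj[current])   # reveal the next unused entry of L_current
        lists[current].append(entry)
        current = entry
        walk.append(current)
    return walk, dict(lists)
```

The number of entries consumed from $L_v$ equals the number of times the walk leaves $v$, which is the quantity controlled in Section 2.1.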

We will prove Theorem 1 in two stages. In Section 2.1 we show that with high probability $W$ visits each vertex of $G$ about as often as we expect. Observe that, using our alternative description of $W$, this tells us (roughly) how many elements from each list $L_v$ we used in the construction of $W$. In Section 2.2 we show that after removing the edges of $G$ corresponding to these elements from the top of the lists $L_v$ we are left with a graph which strongly resembles a random subgraph of $G$ of the appropriate density.

2.1 The number of visits to each vertex 

To begin this subsection we recall some useful facts. A random walk $W$ on a graph $G$ is a Markov chain with transition matrix $P$ given by

$$P_{uv} = \begin{cases} 1/d(u) & \text{if } uv \in E(G); \\ 0 & \text{if } uv \notin E(G). \end{cases}$$

Thus $P$ is a normalised version of the adjacency matrix $A$ where each row has been scaled by the degree of the corresponding vertex. The eigenvalues of $P$ are all real; let these be $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ and write $\lambda = \max(|\lambda_2|, |\lambda_n|)$. The first eigenvalue $\lambda_1$ of $P$ is always equal to 1 and has a corresponding eigenvector $\pi = (\pi_v)$ given by $\pi_v = \frac{d(v)}{2e(G)}$. This vector $\pi$ is called the stationary distribution of the walk $W$. It is well-known (for example see [6]) that if $G$ is connected and non-bipartite then, for any initial distribution of $W_0$, the distribution of $W_i$ converges to $\pi$ as $i \to \infty$ (i.e. $\mathbb{P}(W_i = v) \to \pi_v$ as $i \to \infty$ for each $v$). The following standard result, which can be read out of Jerrum and Sinclair [3], gives control on the rate of this convergence.

Lemma 3. For any $n$-vertex graph $G$ with minimum degree at least $\gamma n$ and any initial distribution on $W_0$, we have

$$\max_{v \in V(G)} |\mathbb{P}(W_i = v) - \pi_v| \leq C_\gamma \lambda^i$$

for some $C_\gamma$ depending on $\gamma$.
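The transition matrix and the stationary distribution recalled above can be checked exactly on a small graph. The sketch below is an illustration under stated assumptions: the graph is an adjacency-list dictionary, exact rational arithmetic is used, and the names `transition_matrix`, `stationary_distribution` and `step` are hypothetical.

```python
from fractions import Fraction

def transition_matrix(adj):
    """P[u][v] = 1/d(u) if uv is an edge and 0 otherwise."""
    vertices = sorted(adj)
    P = {u: {v: Fraction(0) for v in vertices} for u in vertices}
    for u in vertices:
        for v in adj[u]:
            P[u][v] = Fraction(1, len(adj[u]))
    return P

def stationary_distribution(adj):
    """pi_v = d(v) / 2e(G)."""
    two_e = sum(len(nbrs) for nbrs in adj.values())   # the degree sum equals 2e(G)
    return {v: Fraction(len(adj[v]), two_e) for v in adj}

def step(dist, P):
    """One step of the chain: the row vector dist multiplied by P."""
    return {v: sum(dist[u] * P[u][v] for u in P) for v in P}
```

Applying `step` to the stationary distribution returns it unchanged, which is the statement that $\pi$ is a left eigenvector of $P$ with eigenvalue 1.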

Now if $G$ is a regular $\epsilon$-quasirandom graph then $\lambda$ is small on the scale of $\epsilon$. (This is because the 'spectral gap' of a quasirandom graph is large [2], and $P$ is a scalar multiple of $A$ when $G$ is regular.) For a general $\epsilon$-quasirandom graph this need not be true: for example, if $G$ contains a small connected component, then $\lambda = 1$ (the 1-eigenspace is spanned by the stationary distribution of each connected component of $G$). Similarly, $\lambda$ can be very close to 1 if there is a small set of vertices that is only weakly connected to the rest of the graph. However, a lower bound on the minimum degree of $G$ is enough to recover an upper bound on $\lambda$.
Lemma 4. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with minimum degree at least $\gamma n$ where $p, \gamma > C\epsilon^{1/4}$ for some absolute constant $C > 0$. Then, for $n$ sufficiently large, $\lambda < 1/2$.

Before proving Lemma 4 we make the following simple observation about quasirandom graphs which we will use repeatedly.

Proposition 5. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges, and let $X$ be a set of vertices with $|X| \geq \epsilon n$. Let $Y = \{v \in V(G) : |e(v,X) - p|X|| > \epsilon|X|\}$. Then $|Y| < 2\epsilon n$.

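Proof. We have $Y = Y^- \cup Y^+$ where

$$Y^+ = \{v \in V(G) : e(v,X) > p|X| + \epsilon|X|\},$$
$$Y^- = \{v \in V(G) : e(v,X) < p|X| - \epsilon|X|\}.$$

Clearly

$$\bigl|e(Y^+,X) - p|X||Y^+|\bigr| > \epsilon|X||Y^+| \quad\text{and}\quad \bigl|e(Y^-,X) - p|X||Y^-|\bigr| > \epsilon|X||Y^-|.$$

But then, since $G$ is $\epsilon$-quasirandom and $|X| \geq \epsilon n$, we must have $|Y^+|, |Y^-| < \epsilon n$. □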

In particular, taking $X = V(G)$ there are at least $(1 - 2\epsilon)n$ vertices $v$ of $G$ with $|d(v) - pn| \leq \epsilon n$. We will call such vertices balanced.

Proof of Lemma 4. The proof follows a well-known argument (see for example [2]). We first estimate the number of labelled copies of $C_4$ in $G$, and then evaluate the trace of $P^4$ in two different ways. Note that the implicit constants in our use of $O(\cdot)$ notation here are absolute.

The number of labelled copies of $C_4$ in $G$ is

$$2 \cdot (1 + O(\epsilon))n \cdot (1 + O(\epsilon))n \cdot \binom{(p + O(\epsilon))^2 n}{2} + O(\epsilon)n^2 \cdot \binom{n}{2} = (p + O(\epsilon))^4 n^4 + O(\epsilon)n^4 = (1 + O(\epsilon/p^4))p^4n^4,$$

where the main term here accounts for balanced vertices $u$ and $v$ with close to $p^2n$ common neighbours, and the error term bounds the contribution to the sum from each other pair by $\binom{n}{2}$.

Now the trace of $P^4$ is a weighted sum of the closed walks of length 4 in $G$, where the weight of the closed walk $uvwx$ is $1/(d(u)d(v)d(w)d(x))$. Thus

$$\sum_{u \in V(G)} (P^4)_{uu} = \frac{(1 + O(\epsilon/p^4))p^4n^4}{((p + O(\epsilon))n)^4} + \frac{O(\epsilon)n^4}{(\gamma n)^4} + \frac{O(n^3)}{(\gamma n)^4} = 1 + O(\epsilon/p^4) + O(\epsilon/\gamma^4) + O(1/(\gamma^4 n)),$$

where the main term counts the contribution from 4-cycles containing only balanced vertices and the error terms account for the contributions from 4-cycles with at least one unbalanced vertex and from closed walks of length 4 which are not 4-cycles respectively. (The lower bound on the minimum degree of $G$ gives an upper bound of $1/(\gamma n)^4$ for the weight of any one walk.) But we also have

$$\sum_{u \in V(G)} (P^4)_{uu} = \sum_{i=1}^{n} \lambda_i^4 = 1 + \sum_{i=2}^{n} \lambda_i^4,$$

from which it follows that

$$\lambda^4 \leq \sum_{i=2}^{n} \lambda_i^4 = O(\epsilon/p^4) + O(\epsilon/\gamma^4) + O(1/(\gamma^4 n)) < 1/16,$$

for $p, \gamma > C\epsilon^{1/4}$ and $n$ sufficiently large. □

For the next lemma we will need to approximate one probability measure by another on the same space. Given a finite probability space $\Omega$, the total variation distance between two probability measures $\mu_1$ and $\mu_2$ is defined by

$$d_{TV}(\mu_1, \mu_2) = \frac{1}{2}\sum_{\omega \in \Omega} |\mu_1(\omega) - \mu_2(\omega)|.$$

This is the amount of probability mass that would have to be moved to turn one distribution into the other.
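For finite distributions this definition is a one-line computation. The sketch below assumes each measure is given as a dictionary from outcomes to probabilities, with missing outcomes treated as probability zero; the name `total_variation` is illustrative.

```python
def total_variation(mu1, mu2):
    """Half the l1 distance between two probability measures on a finite space."""
    support = set(mu1) | set(mu2)
    return 0.5 * sum(abs(mu1.get(w, 0.0) - mu2.get(w, 0.0)) for w in support)
```

For example, moving all of the mass from one of two equally likely outcomes onto the other requires moving mass 1/2, and the function returns 0.5 in that case.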

Combining Lemma 3 with Lemma 4 it is easy to see that the total variation distance between $W_t$ and a vertex sampled from the stationary distribution is small when $t$ is moderately large. In fact, we get much more.

Let $L = (\log n)^2$ and let $K = \beta n^2/L$. Given $i < L$, let $W^{(i)}$ denote the subsequence of $W$ obtained by starting from $W_i$ and taking $L$ steps at a time: that is, $W^{(i)} = (W^{(i)}_1, \ldots, W^{(i)}_K)$ where $W^{(i)}_j = W_{i+jL}$ for all $j \leq K$. For each $v \in V(G)$ let $X^{(i)}_v$ denote the random variable which counts the number of times $W^{(i)}$ visits $v$. Our next lemma shows that with high probability $X^{(i)}_v$ is close to its mean.

Lemma 6. Let $G$ be a graph satisfying the conditions of Lemma 4 and $v \in V(G)$. Then we have

[displayed inequality missing from the source text]


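Proof. Let $\mu = \pi^K$ be the $K$-fold product measure of $\pi$ on $V(G)^K$; that is, $\mu(w) = \prod_{i=1}^{K} \pi_{w_i}$ for $w \in V(G)^K$. By Lemma 3 and Lemma 4 we have

$$\begin{aligned}
\mathbb{P}(W^{(i)} = w) &= \mathbb{P}(W^{(i)}_1 = w_1)\,\mathbb{P}(W^{(i)}_2 = w_2 \mid W^{(i)}_1 = w_1) \cdots \mathbb{P}(W^{(i)}_K = w_K \mid W^{(i)}_{K-1} = w_{K-1}) \\
&= (\pi_{w_1} + O(2^{-(\log n)^2}))(\pi_{w_2} + O(2^{-(\log n)^2})) \cdots (\pi_{w_K} + O(2^{-(\log n)^2})) \\
&= (\pi_{w_1} + O(n^{-6}))(\pi_{w_2} + O(n^{-6})) \cdots (\pi_{w_K} + O(n^{-6})) \\
&= (1 + O(n^{-3}))\mu(w),
\end{aligned}$$

since $\pi_v$ is of order $1/n$ for all $v$ and $K = O(n^2)$. Summing over all $w$ then gives that

$$d_{TV}(\mathbb{P}, \mu) = O(n^{-3}),$$

where $\mathbb{P}$ is the measure on $V(G)^K$ induced by $W^{(i)}$. Now let $A$ be the set of $w \in V(G)^K$ for which the number of coordinates equal to $v$ deviates from its mean by more than the statement of the lemma allows. By Chernoff's inequality (see [1, A.1.11 and A.1.13])

$$\mu(A) \leq 2e^{-(4 + o(1))\log n}.$$

Since $\mathbb{P}(A) \leq \mu(A) + d_{TV}(\mathbb{P}, \mu)$, the result follows. □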




Corollary 7. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with minimum degree at least $\gamma n$, and let $W$ be a random walk on $G$ of length $\beta n^2$. Then

[displayed statement missing from the source text]

Thus with high probability the number of visits $W$ makes to each $v \in V(G)$ is $(\beta/p + o(1))d(v)$.

2.2 The number of edges removed from subgraphs

Recall that by the definition of the $L_v$, for each $v \in V(G)$ and $u \in L_v$ we have $uv \in E(G)$. Thus each entry of a list $L_v$ corresponds to an edge of $G$. For every $\nu > 0$, we let $G_{\mathrm{list}}(\nu)$ denote the random subgraph of $G$ obtained by deleting the edges of $G$ corresponding to the first $\nu d(v)$ entries in each list $L_v$. By Corollary 7 we have that $G_{\mathrm{list}}(\beta/p + o(1)) \subseteq G' \subseteq G_{\mathrm{list}}(\beta/p - o(1))$ with high probability. It therefore suffices to study the edge distribution of $G_{\mathrm{list}}(\nu)$.

We will show that, on large scales, $G_{\mathrm{list}}(\nu)$ has the same edge distribution properties as a random graph of the appropriate density. Our next lemma calculates this density.

Lemma 8. The probability that an edge of $G$ is retained in $G_{\mathrm{list}}(\nu)$ is $e^{-2\nu} + o(1)$.

Proof. For the edge $uv$ to be retained in $G_{\mathrm{list}}(\nu)$, $v$ must not appear in the first $\nu d(u)$ entries of $L_u$, and $u$ must not appear in the first $\nu d(v)$ entries of $L_v$. Hence the probability that $uv$ is retained in $G_{\mathrm{list}}(\nu)$ is

$$\left(1 - \frac{1}{d(u)}\right)^{\nu d(u)}\left(1 - \frac{1}{d(v)}\right)^{\nu d(v)} = e^{-2\nu} + o(1),$$

since $d(u), d(v) \geq \gamma n$. □
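The limit in Lemma 8 is easy to check numerically. The sketch below assumes the retention probability is the product $(1 - 1/d(u))^{\nu d(u)}(1 - 1/d(v))^{\nu d(v)}$, as the independence of the list entries gives, and rounds $\nu d$ down to an integer number of entries; the function name `retention_probability` is hypothetical.

```python
import math

def retention_probability(d_u, d_v, nu):
    """Probability that the edge uv avoids the first nu*d entries of L_u and of L_v."""
    k_u = int(nu * d_u)   # entries of L_u examined (integer part)
    k_v = int(nu * d_v)   # entries of L_v examined (integer part)
    return (1 - 1 / d_u) ** k_u * (1 - 1 / d_v) ** k_v
```

For large degrees the value is close to `math.exp(-2 * nu)` whatever the two degrees are, which is why only the ratio $\nu$ matters in the lemma.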
To show that the number of edges retained in any subgraph is close to its expectation we use Talagrand's concentration inequality [9]. In its usual form Talagrand's inequality is asymmetric and bounds a random variable in terms of its median. We use the following symmetric version (see Chapter 20]) that gives concentration of the random variable about its mean.

Theorem 9. Let $\Omega = \prod_{i=1}^{N} \Omega_i$ be a product of probability spaces with the product measure. Let $X$ be a random variable on $\Omega$ such that

(i) $|X(\omega) - X(\omega')| \leq c$ whenever $\omega$ and $\omega'$ differ on only a single coordinate, for some constant $c > 0$;

(ii) whenever $X(\omega) \geq r$ there is a set $I \subseteq \{1, \ldots, N\}$ with $|I| = r$ such that $X(\omega') \geq r$ for all $\omega' \in \Omega$ with $\omega'_i = \omega_i$ for all $i \in I$.

Then for $0 \leq s \leq \mathbb{E}(X)$,

$$\mathbb{P}\left(|X - \mathbb{E}(X)| > s + 60c\sqrt{\mathbb{E}(X)}\right) \leq 4e^{-s^2/8c^2\mathbb{E}(X)}.$$

Lemma 10. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges and let $W$ be a random walk on $G$ of length $\beta n^2$. Then with probability at least $1 - o(1)$, for all $A, B \subseteq V(G)$ with $e_G(A,B) \gg n$, we have $e_{G_{\mathrm{list}}(\nu)}(A,B) = (e^{-2\nu} + o(1))e_G(A,B)$.

Proof. We apply Theorem 9 to the space $\Omega = \prod_{v \in V(G)} \prod_{i=1}^{\nu d(v)} N(v)$, where each neighbourhood has the uniform probability measure; we can view $\Omega$ as the space of choices for the first $\nu d(v)$ entries of each list $L_v$. Let $A, B \subseteq V(G)$ be sets in $G$ with $e_G(A,B) \gg n$. Let $G_{\mathrm{rem}}$ denote the subgraph of $G$ consisting of those edges which are removed from $G$ to obtain $G_{\mathrm{list}}(\nu)$ and consider the random variable $X_{A,B} = e_{G_{\mathrm{rem}}}(A,B)$. It is easy to see that $X_{A,B}$ satisfies the conditions of Talagrand's inequality. Indeed, (i) holds since changing a list entry can change $X_{A,B}$ by at most $c = 2$. Furthermore, (ii) holds since if $X_{A,B} \geq s$, there are $s$ list entries witnessing this fact. Therefore, by Theorem 9, for $120\sqrt{\mathbb{E}(X_{A,B})} \leq t \leq \mathbb{E}(X_{A,B})$ we have

$$\mathbb{P}(|X_{A,B} - \mathbb{E}(X_{A,B})| > 2t) \leq 4e^{-t^2/32\mathbb{E}(X_{A,B})}.$$

But by Lemma 8 we have $\mathbb{E}(X_{A,B}) = (1 - e^{-2\nu} + o(1))e_G(A,B) \gg n$. Taking $t = C'\sqrt{n\mathbb{E}(X_{A,B})}$ $(= o(\mathbb{E}(X_{A,B})))$ for large enough $C' > 0$ gives that

$$\mathbb{P}(|X_{A,B} - (1 - e^{-2\nu} + o(1))e_G(A,B)| > 2t) \leq 8^{-n}.$$

But there are at most $2^n$ choices for $A$ and $2^n$ choices for $B$. Therefore, with probability at least $1 - 2^{-n}$ we have $X_{A,B} = (1 - e^{-2\nu} + o(1))e_G(A,B)$, for all pairs $(A,B)$ with $A, B \subseteq V(G)$ and $e_G(A,B) \gg n$. Since $e_{G_{\mathrm{list}}(\nu)}(A,B) = e_G(A,B) - X_{A,B}$ the result follows. □

3 General case 

We now move to the case of a general $\epsilon$-quasirandom graph $G$ with edge density $p$. Such $G$ must always contain a connected component of order at least $(1 - \epsilon)n$ (as otherwise we can find two large sets with no edges between them), so by restricting our walk to this component we can assume that $G$ is connected.

The extra difficulty in the general case is that there might be small sets of vertices that are only weakly connected to the rest of the graph in which the random walk can get 'stuck'. For example, let $G$ be a graph consisting of a small clique of order $\epsilon^2 n/2$ joined to a large clique of order $(1 - \epsilon^2/2)n$ by a single edge. Then $G$ is $\epsilon$-quasirandom, but it is not even true that the number of edges in $G'$ is concentrated near some value. Indeed, if we start our random walk in the large clique then with positive probability (depending on $\epsilon$ but not on $n$) $W$ will lie entirely within the large clique, but there is also positive probability (depending on $\epsilon$ but not on $n$) that $W$ will cross to the small clique in the first $\epsilon n^2$ steps and remain there. So for general quasirandom graphs we cannot hope for as strong a result as Theorem 1, and our assertions about high probability will necessarily depend on $\epsilon$ as well as $n$. In this section we use 'with high probability' to mean 'with probability $1 - o_\epsilon(1)$', with $o_\epsilon(1)$ small (depending on $\epsilon$) for large $n$ as defined in Section 1.

Our task in this section is to find a weaker replacement for Corollary 7 in Section 2.1. Recall that we call a vertex $v$ balanced if $|d(v) - pn| \leq \epsilon n$. We will show that if $W$ is a random walk of length $\beta n^2$ on $G$ with $W_0$ balanced, then with high probability $W$ hits most vertices of $G$ about the right number of times. The results in Section 2.2 can then be used to prove Theorem 2 in the same way that Theorem 1 was deduced from Corollary 7.

Our first lemma gives a lower bound on the probability that a given step of a random walk $W$ is in a set $S \subseteq V(G)$. Write $1_X$ for the indicator function of a set $X$ and $1_v$ for the indicator function of the set $\{v\}$. Note that if the initial distribution for $W_0$ is $\pi$ then $\mathbb{P}(W_i \in S) = \sum_{v \in S} \pi_v = \pi \cdot 1_S$ for any set $S \subseteq V(G)$ when $i \geq 0$. The next result shows that this is still almost true if $W$ starts from a balanced vertex, $S$ is large and $i \geq 2$.

Lemma 11. Let $G$ be a connected $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges and $v$ be a balanced vertex. Let $S \subseteq V(G)$ with $|S| \geq \epsilon n$. Then for a random walk $W$ starting at $v$ we have

$$\mathbb{P}(W_i \in S) \geq \pi \cdot 1_S - 8\sqrt{\epsilon}/p \geq |S|/n - 9\sqrt{\epsilon}/p$$

for $i \geq 2$ and $n$ sufficiently large.

Proof. We first show that the random walk is quite well mixed after only two steps. Let $A$ be the set of neighbours of $v$ with degree at most $(p + \epsilon)n$ and $B$ be the set of vertices with at least $(p - \epsilon)|A|$ neighbours in $A$ (the 'well-behaved' first and second neighbourhoods of $v$). By $\epsilon$-quasirandomness, $|A| \geq d(v) - \epsilon n \geq (p - 2\epsilon)n$ and $|B| \geq (1 - \epsilon)n$. We have

$$1_v P = \frac{1}{d(v)} 1_{N(v)} \geq \frac{1}{(p + \epsilon)n} 1_A,$$

where the inequality holds in each coordinate. For $x \in B$

$$(1_A P)_x = \sum_{y \in A,\ xy \in E(G)} \frac{1}{d(y)} \geq \frac{(p - \epsilon)(p - 2\epsilon)n}{(p + \epsilon)n} \geq p(1 - 4\epsilon/p),$$

where the first inequality holds since each $y \in A$ has degree at most $(p + \epsilon)n$, $x$ has at least $(p - \epsilon)|A|$ neighbours in $A$ and $|A| \geq (p - 2\epsilon)n$. Since the entries of $P$ are non-negative we can compose these inequalities to obtain

$$1_v P^2 \geq \frac{1 - 5\epsilon/p}{n} 1_B.$$

Let $b = \frac{1 - 5\epsilon/p}{n} 1_B$. Since $\pi_x = \frac{d(x)}{2e(G)}$, if $x$ is a balanced vertex then $\frac{1 - \epsilon/p}{n-1} \leq \pi_x \leq \frac{1 + \epsilon/p}{n-1}$; otherwise we have the weaker bound $\pi_x \leq \frac{1}{pn}$. Since at most $2\epsilon n$ vertices are unbalanced and at most $\epsilon n$ vertices are not in $B$,

$$\|b - \pi\|_2 \leq \left(\frac{64\epsilon}{p^2 n}\right)^{1/2}.$$

Then, for $i \geq 2$,

$$\begin{aligned}
\mathbb{P}(W_i \in S) &= 1_v P^i 1_S \\
&= 1_v P^2 \cdot P^{i-2} 1_S \\
&\geq b P^{i-2} 1_S \\
&= \pi P^{i-2} 1_S + (b - \pi)P^{i-2} 1_S.
\end{aligned}$$

By Cauchy-Schwarz, and the fact that the eigenvalues of $P$ are at most 1,

$$|(b - \pi)P^{i-2} 1_S| \leq \|b - \pi\|_2 \|1_S\|_2 \leq \left(\frac{64\epsilon}{p^2 n}\right)^{1/2} n^{1/2} = 8\sqrt{\epsilon}/p,$$

and so

$$\mathbb{P}(W_i \in S) \geq \pi \cdot 1_S - 8\sqrt{\epsilon}/p,$$

proving the first inequality. Since at least $|S| - 2\epsilon n$ elements of $S$ are balanced,

$$\pi \cdot 1_S = \sum_{x \in S} \frac{d(x)}{2p\binom{n}{2}} \geq \frac{(|S| - 2\epsilon n)(p - \epsilon)}{pn} \geq |S|/n - 2\epsilon - \epsilon/p \geq |S|/n - \sqrt{\epsilon}/p,$$

which proves the second inequality. □



We now consider the following variant of the list model for constructing a random walk. Fix some small length $L$ and let $K = \beta n^2/L$. By a block rooted at $v$ we mean a random walk of length $L$ starting at $v$. For each vertex $v$, let $\Lambda_v$ be an infinite list of blocks rooted at $v$. We construct a random walk of length $\beta n^2$ as follows. Choose $W_0$ from the given initial distribution, and, at each stage $s = 1, \ldots, K$, let $W_{(s-1)L} \cdots W_{sL}$ be the first unused block rooted at $W_{(s-1)L}$. At the end of the construction we have examined $K$ blocks in total from the top of the $n$ lists. Let $M$ be the set of blocks examined (equivalently, the multiset of roots of blocks used).

This construction generalises the simple list model (which corresponds to the case $L = 1$), and we again hope to exploit the independence of blocks by applying standard concentration inequalities. There are two main obstacles. One is that we do not know anything about the distribution of a block rooted at a vertex $v$ which is not balanced. We therefore first show that most of the root vertices are balanced. The second obstacle is that we do not know in advance which set of blocks we will examine. We handle this by approaching the problem from the other direction: for a given multiset $M$, what is the probability that the corresponding blocks do not contain an even distribution of the vertices? This turns out to be small enough that summing over all possible $M$ gives the bound we require.

Figure 1: The construction examines $K$ blocks from the top of the lists $\Lambda_v$, but we cannot tell in advance which blocks these will be.

Lemma 12. Let $G$ be a connected $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges and let $W$ be a random walk of length $\beta n^2$ starting at a balanced vertex of $G$. Let $\delta = 3\epsilon^{1/4}p^{-1/2}$ and suppose that $n$ is sufficiently large. Then with probability at least $1 - 3\delta$ there exists a set $B \subseteq V(G)$ with $|B| \geq (1 - \delta)n$ such that each vertex in $B$ is hit at least $(1 - 4\delta)\beta n$ times by $W$.

Proof. Take $L = \omega(n)$ for any function $\omega(n) \ll n/\log n$ which tends to infinity as $n \to \infty$, and let $K = \beta n^2/L$. Construct a random walk $W$ as described above and let $x_1, \ldots, x_K$ be the roots of the $K$ blocks used. We first show that with high probability many of the vertices $\{x_1, \ldots, x_K\}$ are balanced.

Let $U$ be the number of $x_i$ that are unbalanced. By Lemma 11, for $i \geq 2$,

$$\mathbb{P}(x_i \text{ is unbalanced}) \leq 1 - ((1 - 2\epsilon) - \delta^2) \leq 2\delta^2,$$

since there are at least $(1 - 2\epsilon)n$ balanced vertices and $\delta^2 \geq 2\epsilon$. By Markov's inequality,

$$\mathbb{P}(U \geq \delta K) \leq \frac{\mathbb{E}(U)}{\delta K} \leq \frac{2\delta^2 K}{\delta K} = 2\delta.$$

Now let $M$ be a multiset of $(1 - \delta)K$ balanced vertices and let $W^{(1)}, W^{(2)}, \ldots, W^{((1-\delta)K)}$ be the corresponding blocks. We will show that the probability that these blocks contain most balanced vertices about the right number of times is large.

Let $S \subseteq V(G)$ with $|S| \geq \delta n$. By Lemma 11, for every $1 \leq i \leq (1 - \delta)K$ and every $j \geq 2$ we have $\mathbb{P}[W^{(i)}_j \in S] \geq \delta - \delta^2$. Let $X_{ij}$ be the indicator of the event $W^{(i)}_j \in S$, let $X_j = \sum_{i=1}^{|M|} X_{ij}$ and let $X_{M,S} = \sum_{j=1}^{L} X_j$. For fixed $j$ the $X_{ij}$ are independent, so by Chernoff's inequality (see [1, Appendix A]),

$$\mathbb{P}(X_j \leq (\delta - 2\delta^2)|M|) \leq e^{-2\delta^4|M|}.$$

Hence

$$\begin{aligned}
\mathbb{P}(X_{M,S} \leq (\delta - 4\delta^2)\beta n^2) &\leq \mathbb{P}(X_{M,S} \leq (\delta - 3\delta^2)(1 - \delta)KL) \\
&\leq \mathbb{P}(X_j \leq (\delta - 2\delta^2)|M| \text{ for some } 2 \leq j \leq L),
\end{aligned}$$

where the second inequality holds for large $n$ because the contribution from $X_1$ is negligible as $L \to \infty$.

If the random walk $W$ fails to hit at least $(1 - \delta)n$ vertices at least $(1 - 4\delta)\beta n$ times each then either $\delta K$ of the $x_i$ are unbalanced or there is an $M$ and an $S$ such that $X_{M,S} \leq (\delta - 4\delta^2)\beta n^2$. But the probability of this bad event is at most

$$\begin{aligned}
\mathbb{P}(U \geq \delta K) + \sum_{M}\sum_{S} Le^{-2\delta^4|M|} &\leq 2\delta + \binom{K + n - 1}{n - 1} \cdot 2^n \cdot Le^{-2\delta^4(1-\delta)K} \\
&\leq 2\delta + O(K)^n \cdot 2^n \cdot L \cdot e^{-2\delta^4(1-\delta)K} \\
&\leq 2\delta + \exp(O(n\log n) + O(n) + O(\log n) - 2\delta^4(1 - \delta)K) \\
&\leq 3\delta,
\end{aligned}$$

for $n$ sufficiently large, since $K \gg n\log n$. □
We now have everything we need to complete the proof of Theorem 2.

Proof of Theorem 2. We will show that with probability $1 - o_\epsilon(1)$ the graph $G'$ obtained from $G$ by removing the edges of $W$ is close to $G_{\mathrm{list}}(\beta/p)$. It then follows from Lemma 10 that $G'$ is $\eta$-quasirandom with probability $1 - o_\epsilon(1)$.

Since there are at most $2\epsilon n < \delta n$ unbalanced vertices in $G$, by Lemma 12 with probability at least $1 - 3\delta$ there is a set $B$ of $(1 - 2\delta)n$ balanced vertices such that every $v \in B$ is hit at least $(1 - 4\delta)\beta n \geq (1 - 5\delta)\frac{\beta}{p}d(v)$ times by $W$. This accounts for $(1 - 2\delta)n \cdot (1 - 4\delta)\beta n \geq (1 - 7\delta)\beta n^2$ of the list entries examined, so $G'$ differs from $G_{\mathrm{list}}(\beta/p)$ by at most $14\delta\beta n^2$ edges. Since $\delta$ tends to 0 with $\epsilon$, the result follows. □



4 Trees 

A homomorphism from a graph $H$ to a graph $G$ is an edge-preserving map $\phi : V(H) \to V(G)$. A random walk can be viewed as a random homomorphism of a path; a natural generalisation is to consider a random homomorphism of some other tree $T$ (sometimes called a tree-indexed random walk). Just as we traversed a path in one direction, our trees will be rooted and we think of them as directed 'downwards', away from the root. In this section we will explore to what extent the methods of Section 3 can be applied in this more general setting.

We generate a random homomorphism as follows. Enumerate the vertices of $T$ as $v_0, v_1, \ldots$, where, for each $j$, $T[v_0, \ldots, v_j]$ is a connected subtree of $T$ containing the root $v_0$. First choose $\phi(v_0)$ from a given initial distribution. Then, at each stage $j > 0$, let $u$ be the parent of $v_j$ in $T$ and choose $\phi(v_j)$ uniformly at random from the neighbours of $\phi(u)$. All choices are made independently, and we can think of these choices as being taken from the lists $L_v$ as before.

Suppose now that $G$ is an $\epsilon$-quasirandom graph on $n$ vertices. Let $\phi$ be a random homomorphism of a tree $T$ of size $\beta n^2$ to $G$, and let $G'$ be the graph obtained from $G$ by deleting the edges of $\phi(T)$. Is $G'$ quasirandom with high probability? It is easy to see that in general the answer is no. For example, let $G = K_n$ and $T$ be an $n/2$-ary tree of depth 2 (here $\beta = 1/4 + o(1)$). Then with high probability $\phi(T)$ contains a constant fraction of the edges of $G$. But all of these edges are incident on the neighbourhood of the root, which has $(1 - e^{-1/2} + o(1))n$ vertices with high probability, so with high probability $G'$ is not quasirandom.
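The constant $1 - e^{-1/2}$ in this example comes from a short calculation, sketched below under the assumption that the root's $n/2$ children are mapped independently and uniformly to the $n - 1$ neighbours of the root's image in $K_n$; the name `expected_image_fraction` is illustrative.

```python
import math

def expected_image_fraction(n):
    """Expected fraction of V(K_n) covered by the images of the root's n/2 children."""
    children = n // 2
    # Probability that a fixed neighbour of the root's image is missed by every child.
    miss = (1 - 1 / (n - 1)) ** children
    return (n - 1) / n * (1 - miss)
```

For large `n` the value approaches `1 - math.exp(-0.5)`, roughly 0.39, so the deleted edges are concentrated on well under half of the vertices.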

We seek conditions on $T$ such that we can apply the approach taken in Section 3 with minimal changes. The condition we give here imposes an upper bound on the maximum degree of $T$.

We need an analogue of the second model for the construction of a random walk. Instead of 
breaking our path into many short paths, we break our tree into many small edge-disjoint subtrees. 

Lemma 13. Let $T$ be a rooted tree with $N$ edges and let $L \leq N$. Then $T$ can be written as an edge-disjoint union of rooted trees $R_1, \ldots, R_K$, each of size between $L$ and $3L$.

Proof. Let $v$ be a vertex of $T$ furthest from the root such that $v$ has at least $L$ descendants. Then each branch of $T$ lying below $v$ has at most $L$ edges, so some union of these branches has size between $L$ and $2L$; let this be $R_1$. We obtain $R_2, \ldots, R_K$ similarly until there are fewer than $L$ edges of $T$ remaining, which we add to $R_K$. □
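The decomposition in this proof can be carried out bottom-up. The following sketch is one possible implementation rather than the authors' procedure verbatim: it processes vertices with children before parents, cuts off a piece as soon as the unassigned branches below a vertex contain at least $L$ edges, and merges the final remainder into the last piece cut. The tree is assumed to be given as a dictionary from each vertex to its list of children, and the name `decompose_tree` is hypothetical.

```python
def decompose_tree(children, root, L):
    """Split the edges of a rooted tree into edge-disjoint rooted subtrees.

    Returns a list of (piece_root, edge_list) pairs. If the tree has at least
    L edges, every piece has between L and 3L edges.
    """
    # Depth-first preorder; reversing it visits children before their parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    pending = {}   # pending[v]: unassigned edges below v, always fewer than L
    pieces = []
    for v in reversed(order):
        group = []
        for c in children.get(v, []):
            # The branch through c: the edge vc plus the unassigned edges below c.
            group.append((v, c))
            group.extend(pending.pop(c))
            if len(group) >= L:          # at most 2L - 1 edges at this point
                pieces.append((v, group))
                group = []
        pending[v] = group

    leftover = pending[root]
    if pieces and leftover:
        # Fewer than L edges remain; they join the last piece, which they touch.
        _, last_edges = pieces.pop()
        pieces.append((root, leftover + last_edges))
    elif leftover:
        pieces.append((root, leftover))
    return pieces
```

On a path with 10 edges and $L = 3$ this produces pieces of sizes 3, 3 and 4, the last being a piece of size 3 merged with the single remaining edge at the root.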

Write $\mathcal{R} = \{R_1, \ldots, R_K\}$ for the corresponding set of abstract rooted trees, up to isomorphism. In an abuse of notation we use $R_i$ to refer to both the specific subtree of $T$ and its isomorphism type.

It is convenient to number the $R_i$ such that $R_1 \cup \cdots \cup R_j$ is a subtree of $T$ containing the root for each $j$. We can then describe the second model for the construction of a random homomorphism as follows. For each $v \in V(G)$ and $R \in \mathcal{R}$, let $\Lambda_{v,R}$ be a list of independent random homomorphisms from $R$ to $G$ that map the root of $R$ to $v$. Choose a vertex $v_1$ from the given distribution for the image of the root of $T$ and identify $\phi(R_1)$ with the first entry from $\Lambda_{v_1,R_1}$. (If $R_1$ has a non-trivial automorphism group then there is a choice of identification of $R_1$ with the reference copy in $\mathcal{R}$. The choice is unimportant provided the same choice is made every time.) Then at each stage $j$ we have already determined the image $v_j$ of the root of $R_j$, and we identify $\phi(R_j)$ with the first unused element from $\Lambda_{v_j,R_j}$.

Now let $T$ be a rooted tree with $\beta n^2$ edges. As before we want to show that $T$ 'visits' most vertices of $G$ about the right number of times. We need to be careful here about what counts as a 'visit': what we want to count is the number of times an edge leaves a vertex, as that is the number of entries of the corresponding list that will be examined. So we say $\phi(T)$ visits $x \in V(G)$ whenever $uv$ is an edge of $T$ with $u$ the parent of $v$ and $\phi(u) = x$; the number of visits $\phi(T)$ makes to $x$ is the number of edges $uv$ for which this occurs.

There are three places where the argument in the proof of Lemma 12 needs modification or additional details need to be checked.

(i) In the path case the edges (or vertices) of the blocks had a natural order and the blocks were all the same size. In the tree case we are free to choose a labelling of the edges in each block, but the blocks might still have different sizes: when we look at the $2L$th edge from each block, are there enough blocks with $2L$ edges that Chernoff's inequality will give good concentration?

(ii) In the path case the set of list entries examined was parameterised by multisets of vertices of $G$. In the tree case the set of list entries examined is instead parameterised by multisets of pairs $(v, R)$ with $v \in V(G)$ and $R \in \mathcal{R}$. So the factor $\binom{K+n-1}{n-1}$ in the final sum needs to be replaced by $\binom{K+n|\mathcal{R}|-1}{n|\mathcal{R}|-1}$, and we must restrict the size of $|\mathcal{R}|$ to prevent this becoming too large.

(iii) In the path case we had to ignore the first two vertices of each block as we needed to take two steps before we had good information about the distribution over vertices. This was safe because the ignored vertices were only a $o(1)$ fraction of the total number of vertices. In the tree case we must ignore the edges whose start point is the root of the block or is a child of the root. We need to ensure that the number of ignored edges is at most a small fraction of the total number of edges.

Problem (i) is avoided by throwing away the small number of edges that receive a label shared by few other edges. If we throw away all edges that receive a label which is used fewer than $\epsilon n^2/L^2$ times then the total number of edges thrown away is less than $3\epsilon n^2/L$ as there are at most $3L$ edges in each block.

Figure 2: Deleting a $o(1)$ fraction of the edges ensures that the remaining labels $i$ are each used in a large number of blocks.



Problem (ii) is avoided by taking $L$ small: a suitably small multiple of $\log n$ suffices. Indeed, since the number of rooted trees on $L$ vertices is $O((2.9955\ldots)^L)$ and $\frac{\beta n^2}{3L} \leq K \leq \frac{\beta n^2}{L}$, we have in this case that $n|\mathcal{R}| \leq n^{3/2} \leq K$, and

$$\binom{K + n|\mathcal{R}| - 1}{n|\mathcal{R}| - 1} = \exp(O(n^{3/2}\log n)),$$

which is small enough that it will not overpower the exponential decay in $K$.

Problem (iii) is avoided by having $\Delta^2$, the square of the maximum degree of $T$, small (depending on the desired level of quasirandomness) compared to $L$: so $\Delta$ can be as large as a small multiple of $\sqrt{\log n}$.

With these modifications to our earlier argument we obtain the following result. 

Theorem 14. Given $\beta, p, \eta > 0$ there exists $\epsilon, c > 0$ such that the following holds. Let $G$ be an $n$-vertex $\epsilon$-quasirandom graph with $p\binom{n}{2}$ edges, $T$ be a rooted tree of size $\beta n^2$ with maximum degree $\Delta \leq c\sqrt{\log n}$ and let $\phi$ be a random homomorphism from $T$ to $G$ such that the image of the root is balanced. Then, with probability $1 - o_\epsilon(1)$, the graph $G'$ formed by removing the edges of $\phi(T)$ from $G$ is $\eta$-quasirandom with $(e^{-2\beta/p} + o_\epsilon(1))p\binom{n}{2}$ edges.

It would be interesting to know how large $\Delta(T)$ can be taken in Theorem 14. By the example at the start of this section we must have $\Delta(T)$ small compared to $n$. Is this already enough?